{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/embeddings/huggingface.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Optimized Embedding Model using Optimum-Intel\n",
    "\n",
    "LlamaIndex has support for loading quantized embedding models for Intel, using the [Optimum-Intel library](https://huggingface.co/docs/optimum/main/en/intel/index). \n",
    "\n",
    "Optimized models are smaller and faster, with minimal accuracy loss, see the [documentation](https://huggingface.co/docs/optimum/main/en/intel/optimization_inc) and an [optimization guide](https://huggingface.co/docs/optimum/main/en/intel/optimization_inc) using the IntelLabs/fastRAG library. \n",
    "\n",
    "Optimization is based on math instructions in the Xeon® 4th generation or newer processors. \n",
    "\n",
    "In order to be able to load and use the quantized models, install the required dependency `pip install optimum[exporters] optimum-intel neural-compressor intel_extension_for_pytorch`. \n",
    "\n",
    "Loading is done using the class `IntelEmbedding`; usage is similar to any HuggingFace local embedding model; See example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install llama-index-embeddings-huggingface-optimum-intel"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.embeddings.huggingface_optimum_intel import IntelEmbedding\n",
    "\n",
    "embed_model = IntelEmbedding(\"Intel/bge-small-en-v1.5-rag-int8-static\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "384\n",
      "[-0.0032782123889774084, -0.013396517373621464, 0.037944991141557693, -0.04642259329557419, 0.027709005400538445]\n"
     ]
    }
   ],
   "source": [
    "embeddings = embed_model.get_text_embedding(\"Hello World!\")\n",
    "print(len(embeddings))\n",
    "print(embeddings[:5])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
